Sequential Design of Experiments via Linear Programming

Authors

  • Sudipto Guha
  • Kamesh Munagala
Abstract

The celebrated multi-armed bandit problem in decision theory models the central trade-off between exploration, or learning about the state of a system, and exploitation, or utilizing the system. In this paper we study the variant of the multi-armed bandit problem where the exploration phase involves costly experiments and occurs before the exploitation phase, and where each play of an arm during the exploration phase updates a prior belief about the arm. The problem of finding an inexpensive exploration strategy to optimize a certain exploitation objective is NP-hard even when a single play reveals all information about an arm and all exploration steps cost the same. We provide the first polynomial-time constant-factor approximation algorithm for this class of problems. We show that this framework also generalizes several problems of interest studied in the context of data acquisition in sensor networks. Our analysis also extends to switching and setup costs, and to concave utility objectives. Our solution approach is via a novel linear program rounding technique based on stochastic packing. In addition to yielding exploration policies whose performance is within a small constant factor of the adaptive optimal policy, a nice feature of this approach is that the resulting policies explore the arms sequentially without revisiting any arm. Sequentiality is a well-studied paradigm in decision theory, and is very desirable in domains where multiple explorations can be conducted in parallel, for instance, in the sensor network context.
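To make the setting in the abstract concrete, the following is a minimal Python sketch of a budgeted explore-then-exploit loop in which each play updates a prior belief about an arm, and arms are visited one after another without revisiting. It is not the paper's linear program rounding algorithm. The Beta-Bernoulli prior, the fixed `plays_per_arm` rule, and all names (`BetaArm`, `explore_sequentially`, `exploit`) are illustrative assumptions that do not come from the source.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaArm:
    """Hypothetical arm with a Beta(alpha, beta) prior over its success rate."""
    alpha: float = 1.0
    beta: float = 1.0
    cost: float = 1.0  # assumed cost of one exploratory play

    def mean(self) -> float:
        # Posterior mean of the success probability.
        return self.alpha / (self.alpha + self.beta)

    def update(self, reward: int) -> None:
        # Bayesian update of the prior after observing one 0/1 outcome.
        if reward:
            self.alpha += 1
        else:
            self.beta += 1


def explore_sequentially(arms, true_probs, plays_per_arm, budget, rng):
    """Play arms in index order, never returning to an earlier arm.

    Stops as soon as the next play would exceed the exploration budget.
    Returns the total exploration cost spent.
    """
    spent = 0.0
    for arm, p in zip(arms, true_probs):
        for _ in range(plays_per_arm):
            if spent + arm.cost > budget:
                return spent
            arm.update(1 if rng.random() < p else 0)
            spent += arm.cost
    return spent


def exploit(arms):
    """Exploitation phase: pick the arm with the highest posterior mean."""
    return max(range(len(arms)), key=lambda i: arms[i].mean())


if __name__ == "__main__":
    rng = random.Random(0)
    arms = [BetaArm() for _ in range(3)]
    spent = explore_sequentially(arms, [0.2, 0.8, 0.5], 5, 12.0, rng)
    print("spent:", spent, "chosen arm:", exploit(arms))
```

The fixed number of plays per arm stands in for the adaptive stopping rules that an actual policy would compute; the sketch only illustrates the sequential, no-revisit structure and the separation between a costly exploration phase and a final exploitation choice.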

Similar resources

Sequential Bayesian optimal experimental design via approximate dynamic programming

The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dy...

Sequential DOE via dynamic programming

The paper considers a sequential Design Of Experiments (DOE) scheme. Our objective is to maximize both information and economic measures over a feasible set of experiments. Optimal DOE strategies are developed by introducing information criteria based on measures adopted from information theory. The evolution of acquired information along various stages of experimentation is analyzed for linear...

A programming method to estimate proximate parameters of coal beds from well-logging data using a sequential solving of linear equation systems

This paper presents an innovative solution for estimating the proximate parameters of coal beds from the well-logs. To implement the solution, the C# programming language was used. The data from four exploratory boreholes was used in a case study to express the method and determine its accuracy. Then two boreholes were selected as the reference, namely the boreholes with available well-logging ...

Robust Control via Sequential Semidefinite

This paper discusses nonlinear optimization techniques in robust control synthesis, with special emphasis on design problems which may be cast as minimizing a linear objective function under linear matrix inequality (LMI) constraints in tandem with nonlinear matrix equality constraints. The latter type of constraints renders the design numerically and algorithmically difficult. We solve the opt...

A TRUST-REGION SEQUENTIAL QUADRATIC PROGRAMMING WITH NEW SIMPLE FILTER AS AN EFFICIENT AND ROBUST FIRST-ORDER RELIABILITY METHOD

The real-world applications addressing the nonlinear functions of multiple variables could be implicitly assessed through structural reliability analysis. This study establishes an efficient algorithm for resolving highly nonlinear structural reliability problems. To this end, first a numerical nonlinear optimization algorithm with a new simple filter is defined to locate and estimate the most ...

Sequential Design of Experiments via Linear Programming (18 Jun 2013)

The celebrated multi-armed bandit problem in decision theory models the central trade-off between exploration, or learning about the state of a system, and exploitation, or utilizing the system. In this paper we study the variant of the multi-armed bandit problem where the exploration phase involves costly experiments and occurs before the exploitation phase; and where each play of an arm durin...

Journal:
  • CoRR

Volume: abs/0805.2630  Issue:

Pages: -

Publication date: 2007